Malicious Use


A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy

Wang, Huandong, Fu, Wenjie, Tang, Yingzhou, Chen, Zhilong, Huang, Yuxi, Piao, Jinghua, Gao, Chen, Xu, Fengli, Jiang, Tao, Li, Yong

arXiv.org Artificial Intelligence

While large language models (LLMs) present significant potential for supporting numerous real-world applications and delivering positive social impacts, they still face considerable challenges in terms of the inherent risks of privacy leakage, hallucinated outputs, and value misalignment, and they can be maliciously used to generate toxic content or serve unethical purposes once jailbroken. Therefore, in this survey, we present a comprehensive review of recent advancements aimed at mitigating these issues, organized across the four phases of LLM development and usage: data collection and pre-training, fine-tuning and alignment, prompting and reasoning, and post-processing and auditing. We elaborate on recent advances for enhancing the performance of LLMs in terms of privacy protection, hallucination reduction, value alignment, toxicity elimination, and jailbreak defense. In contrast to previous surveys that focus on a single dimension of responsible LLMs, this survey presents a unified framework encompassing these diverse dimensions, providing a comprehensive view of how to enhance LLMs to better serve real-world applications.


The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation

Brundage, Miles, Avin, Shahar, Clark, Jack, Toner, Helen, Eckersley, Peter, Garfinkel, Ben, Dafoe, Allan, Scharre, Paul, Zeitzoff, Thomas, Filar, Bobby, Anderson, Hyrum, Roff, Heather, Allen, Gregory C., Steinhardt, Jacob, Flynn, Carrick, hÉigeartaigh, Seán Ó, Beard, SJ, Belfield, Haydn, Farquhar, Sebastian, Lyle, Clare, Crootof, Rebecca, Evans, Owain, Page, Michael, Bryson, Joanna, Yampolskiy, Roman, Amodei, Dario

arXiv.org Artificial Intelligence

This report surveys the landscape of potential security threats from malicious uses of AI, and proposes ways to better forecast, prevent, and mitigate these threats. After analyzing the ways in which AI may influence the threat landscape in the digital, physical, and political domains, we make four high-level recommendations for AI researchers and other stakeholders. We also suggest several promising areas for further research that could expand the portfolio of defenses, or make attacks less effective or harder to execute. Finally, we discuss, but do not conclusively resolve, the long-term equilibrium of attackers and defenders.


International Scientific Report on the Safety of Advanced AI (Interim Report)

Bengio, Yoshua, Mindermann, Sören, Privitera, Daniel, Besiroglu, Tamay, Bommasani, Rishi, Casper, Stephen, Choi, Yejin, Goldfarb, Danielle, Heidari, Hoda, Khalatbari, Leila, Longpre, Shayne, Mavroudis, Vasilios, Mazeika, Mantas, Ng, Kwan Yee, Okolo, Chinasa T., Raji, Deborah, Skeadas, Theodora, Tramèr, Florian, Adekanmbi, Bayo, Christiano, Paul, Dalrymple, David, Dietterich, Thomas G., Felten, Edward, Fung, Pascale, Gourinchas, Pierre-Olivier, Jennings, Nick, Krause, Andreas, Liang, Percy, Ludermir, Teresa, Marda, Vidushi, Margetts, Helen, McDermid, John A., Narayanan, Arvind, Nelson, Alondra, Oh, Alice, Ramchurn, Gopal, Russell, Stuart, Schaake, Marietje, Song, Dawn, Soto, Alvaro, Tiedrich, Lee, Varoquaux, Gaël, Yao, Andrew, Zhang, Ya-Qin

arXiv.org Artificial Intelligence

I am honoured to be chairing the delivery of the inaugural International Scientific Report on the Safety of Advanced AI. I am proud to publish this interim report, the culmination of huge efforts by many experts over the six months since the work was commissioned at the Bletchley Park AI Safety Summit in November 2023. We know that advanced AI is developing very rapidly, and that there is considerable uncertainty over how these advanced AI systems might affect how we live and work in the future. AI has tremendous potential to change our lives for the better, but it also poses risks of harm. That is why having this thorough analysis of the available scientific literature and expert opinion is essential. The more we know, the better equipped we are to shape our collective destiny.


The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

Li, Nathaniel, Pan, Alexander, Gopal, Anjali, Yue, Summer, Berrios, Daniel, Gatti, Alice, Li, Justin D., Dombrowski, Ann-Kathrin, Goel, Shashwat, Phan, Long, Mukobi, Gabriel, Helm-Burger, Nathan, Lababidi, Rassin, Justen, Lennart, Liu, Andrew B., Chen, Michael, Barrass, Isabelle, Zhang, Oliver, Zhu, Xiaoyuan, Tamirisa, Rishub, Bharathi, Bhrugu, Khoja, Adam, Zhao, Zhenqi, Herbert-Voss, Ariel, Breuer, Cort B., Marks, Samuel, Patel, Oam, Zou, Andy, Mazeika, Mantas, Wang, Zifan, Oswal, Palash, Lin, Weiran, Hunt, Adam A., Tienken-Harder, Justin, Shih, Kevin Y., Talley, Kemper, Guan, John, Kaplan, Russell, Steneker, Ian, Campbell, David, Jokubaitis, Brad, Levinson, Alex, Wang, Jean, Qian, William, Karmakar, Kallol Krishna, Basart, Steven, Fitz, Stephen, Levine, Mindy, Kumaraguru, Ponnurangam, Tupakula, Uday, Varadharajan, Vijay, Wang, Ruoyu, Shoshitaishvili, Yan, Ba, Jimmy, Esvelt, Kevin M., Wang, Alexandr, Hendrycks, Dan

arXiv.org Artificial Intelligence

The White House Executive Order on Artificial Intelligence highlights the risks of large language models (LLMs) empowering malicious actors in developing biological, cyber, and chemical weapons. To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs. However, current evaluations are private, preventing further research into mitigating risk. Furthermore, they focus on only a few, highly specific pathways for malicious use. To fill these gaps, we publicly release the Weapons of Mass Destruction Proxy (WMDP) benchmark, a dataset of 3,668 multiple-choice questions that serve as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security. WMDP was developed by a consortium of academics and technical consultants, and was stringently filtered to eliminate sensitive information prior to public release. WMDP serves two roles: first, as an evaluation for hazardous knowledge in LLMs, and second, as a benchmark for unlearning methods to remove such hazardous knowledge. To guide progress on unlearning, we develop RMU, a state-of-the-art unlearning method based on controlling model representations. RMU reduces model performance on WMDP while maintaining general capabilities in areas such as biology and computer science, suggesting that unlearning may be a concrete path towards reducing malicious use of LLMs. We release our benchmark and code publicly at https://wmdp.ai
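
Because the abstract describes RMU only as "controlling model representations", the following minimal PyTorch sketch illustrates what a representation-steering unlearning objective of this kind can look like. It is not the authors' released implementation: the tensor shapes, layer choice, and coefficients (steering_coef, alpha) are assumptions for illustration only.

import torch
import torch.nn.functional as F

def unlearning_loss(forget_acts, retain_acts, retain_acts_frozen,
                    control_vec, steering_coef=6.5, alpha=100.0):
    # Forget term: push the updated model's hidden activations on
    # hazardous (forget-set) inputs toward a fixed random direction,
    # scrambling the representations that encode hazardous knowledge.
    forget_loss = F.mse_loss(
        forget_acts, steering_coef * control_vec.expand_as(forget_acts))
    # Retain term: keep activations on benign (retain-set) inputs close
    # to those of a frozen copy of the model, preserving general skills.
    retain_loss = F.mse_loss(retain_acts, retain_acts_frozen)
    return forget_loss + alpha * retain_loss

# Fixed random unit vector in the hidden dimension (size assumed).
hidden_dim = 4096
control_vec = torch.rand(hidden_dim)
control_vec = control_vec / control_vec.norm()

# Dummy activations (batch=2, seq=8) standing in for a chosen layer's
# hidden states on a forget-set batch and a retain-set batch.
forget_acts = torch.randn(2, 8, hidden_dim)
retain_acts = torch.randn(2, 8, hidden_dim)
retain_acts_frozen = torch.randn(2, 8, hidden_dim)
loss = unlearning_loss(forget_acts, retain_acts, retain_acts_frozen, control_vec)

In an actual unlearning run, this loss would be backpropagated through the updated model's parameters; the dummy tensors here only demonstrate the two competing terms.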


Risk and Response in Large Language Models: Evaluating Key Threat Categories

Harandizadeh, Bahareh, Salinas, Abel, Morstatter, Fred

arXiv.org Artificial Intelligence

This paper explores the pressing issue of risk assessment in Large Language Models (LLMs) as they become increasingly prevalent in various applications. Focusing on how reward models, which are designed to fine-tune pretrained LLMs to align with human values, perceive and categorize different types of risks, we delve into the challenges posed by the subjective nature of preference-based training data. By utilizing the Anthropic Red-team dataset, we analyze major risk categories, including Information Hazards, Malicious Uses, and Discrimination/Hateful content. Our findings indicate that LLMs tend to consider Information Hazards less harmful, a finding confirmed by a specially developed regression model. Additionally, our analysis shows that LLMs respond less stringently to Information Hazards compared to other risks. The study further reveals a significant vulnerability of LLMs to jailbreaking attacks in Information Hazard scenarios, highlighting a critical security concern in LLM risk assessment and emphasizing the need for improved AI safety measures.
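
As a rough, hypothetical illustration of the category-level analysis the abstract describes (the paper's own regression model is not specified here), one could regress reward-model harmlessness scores on one-hot risk-category indicators. The category labels follow the abstract, but the scores and data layout below are invented for the sketch.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder

# Invented per-prompt data: a reward model's harmlessness score and the
# annotated risk category of each red-team prompt.
categories = np.array([["Information Hazards"], ["Malicious Uses"],
                       ["Discrimination/Hateful"], ["Information Hazards"],
                       ["Malicious Uses"], ["Discrimination/Hateful"]])
scores = np.array([0.71, 0.32, 0.28, 0.66, 0.35, 0.25])  # made-up values

X = OneHotEncoder(sparse_output=False).fit_transform(categories)
reg = LinearRegression().fit(X, scores)

# A larger coefficient for a category means the reward model tends to
# rate prompts in that category as more harmless (less risky), which is
# the pattern the paper reports for Information Hazards.
print(dict(zip(sorted({c for (c,) in categories}), reg.coef_)))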


Hazards from Increasingly Accessible Fine-Tuning of Downloadable Foundation Models

Chan, Alan, Bucknall, Ben, Bradley, Herbie, Krueger, David

arXiv.org Artificial Intelligence

Public release of the weights of pretrained foundation models, otherwise known as downloadable access [Solaiman, 2023], enables fine-tuning without the prohibitive expense of pretraining. Our work argues that increasingly accessible fine-tuning of downloadable models may increase hazards. First, we highlight research to improve the accessibility of fine-tuning. We split our discussion into research that A) reduces the computational cost of fine-tuning and B) improves the ability to share that cost across more actors. Second, we argue that increasingly accessible fine-tuning methods may increase hazards by facilitating malicious use and by making oversight of models with potentially dangerous capabilities more difficult. Third, we discuss potential mitigating measures, as well as the benefits of more accessible fine-tuning. Given substantial remaining uncertainty about these hazards, we conclude by emphasizing the urgent need for the development of mitigations.
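
As a concrete, hypothetical example of the cost-reducing fine-tuning research the paper discusses, parameter-efficient methods such as LoRA train only small low-rank adapters on top of downloaded weights. The sketch below uses the Hugging Face peft library; the base model and hyperparameters are illustrative choices, not drawn from the paper.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load an open-weights base model (name chosen for illustration).
model = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA inserts trainable low-rank matrices into attention projections,
# so only a tiny fraction of parameters needs gradients and optimizer
# state; this is what makes fine-tuning cheap enough to be widespread.
config = LoraConfig(
    r=8,                        # adapter rank
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the base model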


An Overview of Catastrophic AI Risks

Hendrycks, Dan, Mazeika, Mantas, Woodside, Thomas

arXiv.org Artificial Intelligence

Rapid advancements in artificial intelligence (AI) have sparked growing concerns among experts, policymakers, and world leaders regarding the potential for increasingly advanced AI systems to pose catastrophic risks. Although numerous risks have been detailed separately, there is a pressing need for a systematic discussion and illustration of the potential dangers to better inform efforts to mitigate them. This paper provides an overview of the main sources of catastrophic AI risks, which we organize into four categories: malicious use, in which individuals or groups intentionally use AIs to cause harm; AI race, in which competitive environments compel actors to deploy unsafe AIs or cede control to AIs; organizational risks, highlighting how human factors and complex systems can increase the chances of catastrophic accidents; and rogue AIs, describing the inherent difficulty in controlling agents far more intelligent than humans. For each category of risk, we describe specific hazards, present illustrative stories, envision ideal scenarios, and propose practical suggestions for mitigating these dangers. Our goal is to foster a comprehensive understanding of these risks and inspire collective and proactive efforts to ensure that AIs are developed and deployed in a safe manner. Ultimately, we hope this will allow us to realize the benefits of this powerful technology while minimizing the potential for catastrophic outcomes.


A tsunami of AI misinformation will shape next year's knife-edge elections John Naughton

The Guardian

It looks like 2024 will be a pivotal year for democracy. There are elections taking place all over the free world – in South Africa, Ghana, Tunisia, Mexico, India, Austria, Belgium, Lithuania, Moldova and Slovakia, to name just a few – and, of course, in the US. Of these, the last may be the most pivotal because: Donald Trump is a racing certainty to be the Republican candidate; a significant segment of the voting population seems to believe that the 2020 election was "stolen"; and the Democrats are, well… underwhelming. The consequences of a Trump victory would be epochal. It would mean the end (for the time being, at least) of the US experiment with democracy, because the people behind Trump have been assiduously making what the normally sober Economist describes as "meticulous, ruthless preparations" for his second, vengeful term.


Malicious use of AI could cause 'unimaginable' damage, says UN boss

The Guardian

Malicious use of artificial intelligence systems could cause a "horrific" amount of death and destruction, the UN secretary general has said, calling for a new UN body to tackle the threats posed by the technology. António Guterres said harmful use of AI for terrorist, criminal or state purposes could also cause "deep psychological damage", and he said AI-enabled cyber-attacks were already targeting UN peacekeeping and humanitarian operations. "The malicious use of AI systems for terrorist, criminal or state purposes could cause horrific levels of death and destruction, widespread trauma and deep psychological damage on an unimaginable scale," Guterres said. Speaking at the first UN security council session on AI, he said the advent of generative AI – the term for AI tools such as ChatGPT that produce convincing text, image and voice from human prompts – could be a defining moment for disinformation and hate speech and add a "new dimension" to the manipulation of human behaviour. Guterres called for the creation of a new UN entity along the lines of the Intergovernmental Panel on Climate Change to tackle the risks.


The AI Apocalypse: Will AI Take Over the World?

#artificialintelligence

Welcome to the future where robots rule the world and humans are relegated to the sidelines. Sounds like a science-fiction movie? The rapid advancement of AI technology has sparked a heated debate about the potential consequences of AI and its impact on the future of humanity. But one thing is certain: AI is not just a futuristic fantasy; it's happening right now.